Thursday, September 22, 2016


Q&A with CNI’s Clifford Lynch: Time to re-think the institutional repository?


(A print version of this interview is available here) 


Seventeen years ago 25 people gathered in Santa Fe, New Mexico, to 
discuss ways in which the growing number of e-print servers and digital 
repositories could be made interoperable. 


As scholarly archives and repositories had begun to proliferate, a number of
issues had arisen. There was a concern, for instance, that archives would 
needlessly replicate each other’s content, and that users would have to learn 
multiple interfaces in order to use them. 

It was therefore felt there was a need to develop tools and protocols that
would allow repositories to copy content from each other, and to work in
concert on a distributed basis.


With this aim in mind those attending the New Mexico event, dubbed the Santa
Fe Convention for the Open Archives Initiative (OAI), agreed to create the
(somewhat wordy) Open Archives Initiative Protocol for Metadata Harvesting,
or OAI-PMH for short.


Photo courtesy Susan van Hengstum


Key to the OAI-PMH approach was the notion that data providers — the 
individual archives — would be given easy-to-implement mechanisms for 
making information about what they held in their archives externally 
available. This external availability would then enable third-party service 
providers to build higher levels of functionality by using the metadata 
harvesting protocol. 
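
To make the division of labour concrete, here is a minimal sketch in Python of what a service provider's harvester might do. It assumes a hypothetical repository endpoint (https://repository.example.org/oai) and a hand-written sample response; the function names are illustrative only. The `ListRecords` verb, the `metadataPrefix`, `from` and `set` parameters, and the `oai_dc` Dublin Core format are part of OAI-PMH itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core metadata.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None):
    """Build a ListRecords request URL for a data provider's endpoint.

    base_url is an assumption here: each repository publishes its own.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since then
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one collection
    return base_url + "?" + urlencode(params)


def extract_records(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(
            f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    return records


# Hand-written sample of what a data provider might return (illustrative).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(build_list_records_url("https://repository.example.org/oai",
                                 from_date="2016-01-01"))
    print(extract_records(SAMPLE_RESPONSE))
```

The sketch makes no network request. A real harvester would fetch the URL, follow any resumption token the repository returns, and handle error responses.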


The repository model that the organisers of the Santa Fe meeting had very 
much in mind was the physics preprint server arXiv. This had been created
in 1991 by physicist Paul Ginsparg, who was one of the attendees of the 
New Mexico meeting. As a result, the early focus of the initiative was on 
increasing the speed with which research papers were shared, and it was 
therefore assumed that the emphasis would be on archiving papers that had 
yet to be published (i.e. preprints). 


However, amongst the Santa Fe attendees were a number of open access 
advocates. They saw OAI-PMH as a way of aggregating content hosted in 
local — rather than central — archives. And they envisaged that the archived 
content would be papers that had already been published, rather than 
preprints. These local archives later came to be known as institutional 
repositories, or IRs. 


In other words, the OA advocates present were committed to the concept of 
author self-archiving (aka green open access). The objective for them was to 
encourage universities to create their own repositories and then instruct their 
researchers to deposit in them copies of all the papers they published in 
subscription journals. 


As these repositories would be on the open internet outside any paywall the 
papers would be freely available to all. And the expectation was that OAI-PMH
would allow the content from all these local repositories to be
aggregated into a single searchable virtual archive of (eventually) all
published research. 


Given these different perspectives there was inevitably some tension around 
the OAI from the beginning. And as the open access movement took off, 
and IRs proliferated, a number of other groups emerged, each with their 
own ideas about what the role and target content of institutional repositories 
should be. The resulting confusion continues to plague the IR landscape. 


Moreover, today we can see that the interoperability promised by OAI-PMH 
has not really materialised, few third-party service providers have emerged, 
and content duplication has not been avoided. And to the exasperation of 
green OA advocates, author self-archiving has remained a minority sport, 
with researchers reluctant to take on the task of depositing their papers in 
their institutional repository. Given this, some believe the IR now faces an 
existential threat. 


In light of the challenging, volatile, but inherently interesting situation that 
IRs now find themselves in, I decided recently to contact a few of the Santa
Fe attendees and put some questions to them. My first two approaches were 
unsuccessful, but I struck third-time lucky when Clifford Lynch, director of 
the Washington-based Coalition for Networked Information (CNI), agreed 
to answer my questions. 


I am publishing the resultant Q&A today. This can be accessed in the pdf 
file here. 


As is my custom, I have prefaced the interview with a long introduction. 
However, those who only wish to read the Q&A need simply click on the 
link at the head of the file and go directly to it. 


Posted by Richard Poynder at 12:13


12 comments: 


Anonymous said... 


Part of the problem is that the same few voices are chiming in year after 
year and the analysis is coming from the think tanks and not enough 
from those on the ground. If you were to ask IR managers what the 
barriers are, they might list a finite set of obstacles that could be 
overcome with some more resources and some clear-headed 
community-wide coordination. A few include: pushing back against the
stranglehold that publishers claim over copyright, permissions, and
licensing of scholarly work; largely mediating deposit; and hiring the right
people (and enough of them) to do the work in the IR, i.e. those who 
understand publishing production. What is lacking in the discussion now 
is what we have to lose if we just throw up our hands and give up on the 
IR experiment. Allowing the scholarly communication lifecycle to be 
driven solely by hyper-monetized means does a disservice to authors
and to readers. Have we really come to the point of saying: Oh, Elsevier 
and a few others do it all so well, why should we even bother? They don't 
really do it very well, they just do it financially successfully. Our values 
may be skewing toward hyper-monetization of everything we do (IRs, 
libraries, university presses, all together), but I suspect that we will rue
the day when we let go of our small, individualized contributions to the 
publishing stream. We are seeing the Whole Foods/Walmartization of 
scholarly communication. 


September 23, 2016 12:57 pm 


Mike Taylor said... 
(Note: I have not read the interview yet, just the blog-post.)


I too have my reservations about institutional repositories; but now
seems a very strange time to be predicting their downfall, what with all
the new institutional mandates and climbing compliance rates. Five 
years ago it would have made more sense. (And in five more years, it 
might again.) 


September 23, 2016 1:42 pm 


Stevan Harnad said... 


Just for the record: I have never said or thought that "the only purpose of
the IR is to provide a platform on which researchers can post copies of 
the papers they have published in subscription journals, thereby freeing 
them from the “subscription firewall”."


I said (more often than I would care to recall) that "the primary purpose
of the IR is to..." (etc.).


On the contrary, I myself proposed several secondary purposes,
including impact metrics and research evaluation (and indeed predicted 
and preached that once Green OA prevailed, it would force journals to 
downsize to Fair-Gold OA, with all access-provision and archiving 
offloaded onto the worldwide network of Green OA IRs). 


My argument against confusing IRs with (peer-reviewed) research 
publishers was peer review itself, which is not the métier either of 
librarians or IR-managers. 


(My disagreements with Cliff go way back, too, but I have no interest in
disinterring them.) 


Erstwhile Archivangelist 


September 23, 2016 6:07 pm 


T Scott said... 


The quote of mine that you include in the intro was made in the context 
of a talk I gave at the NASIG meeting in June, titled "Dialectic: The Aims
of Institutional Repositories". The title is a riff off a comment Lynch made
in the foreword to the recently published book "Making Institutional
Repositories Work." In that talk, I track the impact of the contrasting
views of IRs as given in the Lynch and Crow papers that you mention, to 
come up with some recommendations about how to refocus what the 
role of IRs might usefully be. I've linked to the video of the talk, as well 
as the transcript and slides in a blog post here: 
http://tscott.typepad.com/tsp/2016/09/dialectic-the-future-of-institutional-repositories.html


September 23, 2016 10:31 pm 


Tony Ross-Hellauer said... 


The calls for a fundamental rethink of repositories are already being
answered! See the ongoing work of the COAR next-generation
repositories working group:
https://www.coar-repositories.org/activities/advocacy-leadership/working-group-next-generation-repositories/


Also, OpenAIRE's long-term future as a sustainable infrastructure for open
science in Europe and beyond is increasingly secure. We'll be 
establishing ourselves as a legal entity within the next 6-12 months. 


September 26, 2016 5:35 am 


Richard Poynder said... 


Thank you for this, Tony. The vision and priorities page you point to does
not strike me as a fundamental rethink of the institutional repository, but 
essentially more of the same. 


I did interview COAR's Kathleen Shearer in 2014 (here). In doing so, I
did not get a sense that COAR has taken a leadership role in rethinking 
the IR, but rather is trying to keep up with what is happening in practice. 
Consequently, Shearer seemed a little reluctant to define what an IR is, 
and exactly what role it should play, which should perhaps have been a 
starting point. 


Shearer said "[I]n practice, repository services and infrastructures are
diverse and there is a lot of overlap with other systems. Perhaps most 
significantly, practices and technologies are changing quickly, making it a 
challenge to concretely define their services. My feeling is that we need 
to be flexible in the way we conceptualize repositories." 


As I see it, what new initiatives are emerging are coming from
commercial publishers, although it is true that these are generally based
on projects originally started by the research community. What I guess is
key is that publishers have the money to turn ideas into reality (although 
of course this money did come from the research community in the first 
place!) 
That said, I applaud the efforts of organisations like OpenAIRE and
COAR and wish them every success. I also wish you good luck in
achieving sustainability for OpenAIRE. 


September 26, 2016 7:59 am 


David B. Lowe said... 


As for the trajectory of IRs, the mission-critical reason that IRs will be 
with us for the foreseeable future in higher education is not related to OA 
publishing as such, but instead to ETDs (electronic theses and 
dissertations). ETD workflows have many moving parts that require 
approval and buy-in along a chain of scholarly and bureaucratic offices, 
so once they are agreed upon and put in place, they achieve an instant 
inertia of their own, having replaced the paper paths that led to them. 
Documentation is the coin of the realm, but a return to paper is highly 
unlikely, leaving the IR as a pretty stable bet. ETDs are an anchoring 
pillar since they enable the imprimatur of every institution of higher 
learning, namely the terminal degrees conferred. It's the very reason we 
exist. 


As for the trajectory of OA, the Max Planck Society plans are welcome 
news indeed and I look forward to hearing more about them, but for an
operating model of re-engineering the funding arrangement from 
traditional publishing in one research community to OA at the journal 
level, we may turn to the related SCOAP3 project, which dates from 
2007 and which presumably will serve as the foundational case study for 
the Max Planck effort. SCOAP3 was recently renewed for another 3-year 
phase that includes 8 journals. My hope is that, as a next step, we may 
begin to convince professional society publishers that the same sort of 
model can work for them. 


September 27, 2016 8:48 pm 


Anonymous said... 


As a publishing researcher I can second the comment by Richard. All
this is not really offering a new way and is more like reacting to the flow.
Maybe that has to do with the kind of people working on it: the IR crowd
usually comes from the library field and their job is not to be inventive
but to archive and keep stuff safe. No offense meant, quite the contrary.


What bugs me is that platforms that are really on to something like 
ScienceOpen or ResearchGate are either in very close cooperation with 
publishers or with the advertisement industry. Both are not healthy 
partners for this topic, to say it decently. 


I would love to see librarians take a more active role here because these
are people I trust.


September 28, 2016 12:00 pm 


Unknown said... 


(part 1) 


“The reports of our death have been greatly exaggerated” (to paraphrase 
Mark Twain) 


Although I agree with some of what Richard Poynder writes in the
introduction to his recent interview with Cliff Lynch published on
September 22, 2016, I do take exception to a number of the assertions
he makes about the current state of IRs, especially his comments that 
green OA has failed (although this is clearly what the publishers would 
have us believe). 


It is true that repositories have not yet completely fulfilled their potential, 
and there are efforts to shift the transition to open access through APC- 
based gold OA. However, this is a critical time for IRs. The global 
network is now at a point where we have an international mechanism to 
communicate with each other (COAR) and we are consolidating around 
a common vision and strategy for repositories. 


In the last 3 months I have been traveling extensively in Europe, Latin
America and China. All of these regions are investing in repository 
infrastructure to support open access, are working actively to improve 
interoperability across regions, and are establishing regional and/or 
national networks for repositories. In this respect, the United States is an 
outlier, since it has yet to leverage the strategic value of its institutional 
repositories through developing a national network. I hope this will
change in the near future. 


As Poynder alludes to in his introduction, highly centralized systems are 
far easier to launch, nurture and promote; however, there are significant
benefits to a distributed system. It is much less vulnerable to buy-out, 
manipulation, or failure. Furthermore, a global network, managed 
collectively by the university and research community around the world, 
can be more attuned to local values, regional issues and a variety of 
perspectives. Repositories do have the potential to change scholarly 
communication, but there is some urgency that we start to build greater 
momentum now. 


Recognizing the current challenges and opportunities for repositories, 
COAR launched a working group in April 2016 to identify priority 
functionalities for the next generation of repositories. In this activity, our 
vision is clearly articulated:


“To position distributed repositories as the foundation of a globally
networked infrastructure for scholarly communication that is collectively 
managed by the scholarly community. The resulting global repository 
network should have the potential to help transform the scholarly 
communication system by emphasizing the benefits of collective, open 
and distributed management, open content, uniform behaviors, real-time 
dissemination, and collective innovation.” 


Ultimately, what we are promoting is a conceptual model, not a 
technology. Technologies will and must change over time, including 
repository technologies. We are calling for the scholarly community to 
take back control of the knowledge production process via a distributed
network based at scholarly institutions around the world. 

The aim of our next generation repositories working group is to better 
integrate repositories into the research process and make repositories 
truly ‘of the web, not just on the web’. Once we do that, we can support 
the creation of better, more sophisticated value added services. 


September 28, 2016 1:55 pm 


Unknown said... 


(part 2) 


In his comments, Poynder also talks about the lack of full text content in 
repositories and cites one example, the University of Florida, which is 
working with Elsevier to add metadata records. However, one repository 
does not make a trend and COAR does not support this type of model. 
The vast majority of repositories focus on collecting full text content and 
the primary raison d’etre of repositories has always been and remains to 
provide access to full text articles, and other valuable research outputs, 
so they can be re-used and maximize the value and impact of research. 


Poynder also mis-characterizes many of the centralized services 
aggregating repository content saying they “appear (like SSRN) to be 
operated by for-profit concerns”. On the contrary, there are numerous 
examples of not-for-profit aggregators including BASE, CORE, 
SemanticScholar, CiteSeerX, OpenAIRE, LA Referencia and SHARE (I 
could go on). These services index and provide access to a large set of 
articles, while also, in some cases, keeping a copy of the content. 


And finally, Poynder’s comments about the current protocol used for 
interoperability, OAI-PMH, are somewhat misleading. OAI-PMH was a 
child of its time (1999) and was pretty good at what it was supposed to 
do at the time. However, it is out of date and we need a new approach; 
the OAI has proposed ResourceSync, based on Sitemaps, for discovery 
and synchronization of repository resources. A major outcome for the 
COAR Next Generation Repositories Working Group will be 
recommendations about new standards for repository interoperability. 


And so, there is an African proverb that I often quote in my presentations
about the future of repositories, ‘If you want to go fast, go alone. If you 
want to go far, go together’. Indeed, it has taken longer than we had 
anticipated to coalesce around a common vision in a distributed, global 
environment, but we are now well positioned to offer a viable alternative 
for an open and community led scholarly communication system. 


Kathleen Shearer, Executive Director, COAR 
September 28, 2016 1:56 pm 


Unknown said... 


As an Open Access (OA) advocate and (disclaimer) someone who works
for a repositories’ aggregation service (CORE https://core.ac.uk/ - a
non-profit service that caches the aggregated content and maintains a fairly
large collection), I think your introduction painted a rather too gloomy
picture of the purpose of repositories and their future. It is agreed that
OAI-PMH has disadvantages, but it has served the field well for quite some
time now. Having said that, I am very much looking forward to seeing COAR’s
next-generation repositories working group conclusions. In addition, to
me it is dreadful to consider that Gold OA is the future, especially for 
commercial publishers; it is an expensive route to OA - for some 
commercial publishers it is even too expensive - and asking
taxpayers to sustain it for a long period of time is not to their benefit. We
have to accept that commercial companies/publishers will get into the 
OA arena, acquire a small amount of OA products, like SSRN, and 
perhaps shift their OA character. 


Nonetheless, the beauty of the repositories, especially the institutional, 
lies within the fact that they are growing within academic institutions and 
can be used as live archives for the institution, they can host an 
institution’s “fruits”, from research papers to course syllabi and from
organisational bureaucratic documents to outdated webpages. They can
serve as a portfolio to demonstrate a researcher’s work, as a research
impact tool for the university, as a means to text-mine content and as a
tool where OA content can be discovered by everyone around the
world for free. It is my strong belief that we don’t need to abandon
repositories; on the contrary, we have to work harder to improve their
functionalities based on the current needs. 


September 28, 2016 3:36 pm 


Richard Poynder said... 


Many thanks to those who posted the above comments. I have
responded to them in a new post here. 


October 05, 2016 2:06 pm 